Detecting Content-Heavy Sentences: A Cross-Language Case Study
Abstract
The information conveyed by some sentences would be more easily understood by a reader if it were expressed in multiple sentences. We call such sentences content-heavy: they are cumbersome sentences that may be grammatical but are difficult to comprehend. In this paper, we introduce the task of detecting content-heavy sentences in a cross-lingual context. Specifically, we develop methods to identify sentences in Chinese for which English speakers would prefer translations consisting of more than one sentence. We base our analysis and definitions on evidence from multiple human translations and on reader preferences regarding flow and understandability. We show that machine translation quality for content-heavy sentences is markedly worse than overall quality and that this type of sentence is fairly common in Chinese news. We demonstrate that sentence length and punctuation usage in Chinese are not sufficient clues for accurately detecting heavy sentences, and we present a richer classification model that accurately identifies these sentences.
Similar resources
Impersonal Russian Sentences with the Subject in the Accusative Case and the Meaning of a Person's Physical Condition in the Terms of Persian Language
This article considers impersonal sentences with the subject in the accusative case, which convey the physical state of a living being, and attempts to compare them with their Persian correlates. This type of impersonal sentence can cause various problems for Persian-speaking students due to its grammatical specificity (e.g. the uses of the subject in the accusative, rather ...
Psychometric properties of the MacArthur-Bates communicative development inventories – III (CDI-III) in 30 to 37 months old Persian-speaking children
Introduction: Early language skills predict a child's future language skills and literacy, so screening and assessment of speech and language at an early age is important. One cost-effective way of assessing a child's communication is through parent-report tools. The MacArthur-Bates Communicative Development Inventories (CDI) are the most widely used forms that have been applied by prof...
Perception Development of Complex Syntactic Construction in Children with Hearing Impairment
Objectives: Auditory perception, or hearing ability, is critical for children's acquisition of language and speech; hence, hearing loss has different effects on individuals' linguistic perception and on their functions. It seems that deaf people suffer from language and speech impairments, such as in the perception of complex linguistic constructions. This research aimed to study the pe...
Reading and Assessing the City / Neighborhood Fabric as a Text. Case Study: Sar-Tapulah Historical Neighbourhood in Sanandaj
From a linguistic point of view, the city can be seen as a text, consisting of different components and structures being related to each other beyond a sentence. Looking at the city from this point of view, what establishes a syntactic relationship and cohesion and coherence of the components of the city as a common language is called the syntax of the city. Linguistic study of the text of the ...
Nasalance Scores of Sentences in Children with Hearing Loss
Background and purpose: Proper resonance is a major factor for the comprehension of speech in individuals with hearing loss. These people have low speech intelligibility caused by inappropriate resonance. Therefore, nasalance measurement is a principal aspect of the assessment of people with hearing loss. This study aimed at determining nasalance in children with hearing loss. Materials and me...